# Fuzzy c-Means Clustering for Persistence Diagrams

This repository is the official implementation of the decision boundary experiments from the paper Fuzzy c-Means Clustering for Persistence Diagrams.

We develop an algorithm to fuzzy cluster datasets based on their topology.

## Requirements

To install requirements:

```setup
pip install -r requirements.txt
```

## Running the Algorithm

In ```clustering.py``` we provide a function ```fpd_cluster``` that accepts a list of datasets and the number of clusters as input, and returns the membership values and cluster centres.
To use it, place ```clustering.py``` in the same folder as your project and add ```from clustering import fpd_cluster``` at the top of your file.

If you already have a list of persistence diagrams, you can cluster them using the ```pd_fuzzy``` function in ```clustering.py```.
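
For readers unfamiliar with what "membership values" and "cluster centres" mean, the following is a minimal sketch of standard fuzzy c-means on ordinary 2-D points, using only the Python standard library. It is not the code in ```clustering.py```, which works on persistence diagrams rather than Euclidean points. The function name ```fuzzy_c_means```, its parameters, and the deterministic initialisation are assumptions made for illustration only.

```python
import math


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100):
    """Illustrative fuzzy c-means on Euclidean points (hypothetical helper).

    Returns (memberships, centres). memberships[j][k] is the degree to which
    point j belongs to cluster k; each row sums to 1. Requires m > 1.
    """
    n_dims = len(points[0])
    # Assumption: deterministic initialisation at evenly spaced data points.
    # Random initialisation is more common in practice.
    centres = [
        list(points[(k * len(points)) // n_clusters]) for k in range(n_clusters)
    ]
    memberships = []
    for _ in range(n_iter):
        # Membership update: inverse-distance weighting with fuzzifier m.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centres]
            if min(dists) == 0.0:
                # The point coincides with one or more centres:
                # split its membership equally among those centres.
                hits = [d == 0.0 for d in dists]
                row = [1.0 / sum(hits) if h else 0.0 for h in hits]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Centre update: mean of all points, weighted by membership ** m.
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            centres[k] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(n_dims)
            ]
    return memberships, centres


# Two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
memberships, centres = fuzzy_c_means(points, n_clusters=2)
for point, row in zip(points, memberships):
    print(point, [round(u, 3) for u in row])
```

Unlike hard clustering, every point receives a weight for every cluster, so points that lie between groups are not forced into a single one.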

## Results

As demonstrated in the paper, models and tasks can be clustered together using the topology of their decision boundaries, in a way that captures information about the performance of models on tasks.

## Reproducibility

To reproduce the results for each dataset, simply run the relevant file.

## Datasets

We provide copies of [MNIST](http://yann.lecun.com/exdb/mnist/), [FashionMNIST](https://github.com/zalandoresearch/fashion-mnist), and [Kuzushiji-MNIST](https://github.com/rois-codh/kmnist).

## Licensing

All content in this repository is licensed under the MIT License.